From Universal Approximation Theorem to Tropical Geometry of Multi-Layer Perceptrons
We revisit the Universal Approximation Theorem (UAT) through the lens of the tropical geometry of neural networks and introduce a constructive, geometry-aware initialization for sigmoidal multi-layer perceptrons (MLPs). Tropical geometry shows that Rectified Linear Unit (ReLU) networks admit decision functions with a combinatorial structure often described as a tropical rational function, namely a difference of tropical polynomials. Focusing on planar binary classification, we design purely sigmoidal MLPs that adhere to the finite-sum format of the UAT: a finite linear combination of shifted and scaled sigmoids of affine functions. The resulting models yield decision boundaries that already align with prescribed shapes at initialization and can be refined by standard training if desired. This provides a practical bridge between the tropical perspective and smooth MLPs, enabling interpretable, shape-driven initialization without resorting to ReLU architectures. We focus on the construction and empirical demonstrations in two dimensions; theoretical analysis and higher-dimensional extensions are left for future work.
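The finite-sum format described above can be illustrated with a minimal sketch: a decision function built as a linear combination of sigmoids of affine functions whose level set $f = 1/2$ is prescribed at initialization. The parameters below are hypothetical hand-chosen values (not taken from the paper) that carve out the vertical band $|x_1| < 1$ in the plane.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def f(x, k=8.0):
    """Finite sum of sigmoids of affine functions (the UAT format).

    Hypothetical hand-chosen parameters: the level set f = 1/2
    approximates the vertical band |x1| < 1, so the decision boundary
    is already in place at initialization, before any training.
    """
    x1 = np.atleast_2d(x)[:, 0]
    # sigma(k*(x1 + 1)) + sigma(k*(1 - x1)) - 1 ~= indicator of the band
    return sigmoid(k * (x1 + 1.0)) + sigmoid(k * (1.0 - x1)) - 1.0

inside = float(f([[0.0, 0.0]]))   # deep inside the band, close to 1
outside = float(f([[3.0, 0.0]]))  # far outside the band, close to 0
```

The sharpness parameter `k` plays the role of the sigmoid scaling in the construction: larger `k` makes the smooth boundary approximate a piecewise-linear (tropical) one more closely.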
The Measure of Deception: An Analysis of Data Forging in Machine Unlearning
Dixit, Rishabh, Hui, Yuan, Saab, Rayan
Motivated by privacy regulations and the need to mitigate the effects of harmful data, machine unlearning seeks to modify trained models so that they effectively ``forget'' designated data. A key challenge in verifying unlearning is forging -- adversarially crafting data that mimics the gradient of a target point, thereby creating the appearance of unlearning without actually removing information. To capture this phenomenon, we consider the collection of data points whose gradients approximate a target gradient within tolerance $ε$ -- which we call an $ε$-forging set -- and develop a framework for its analysis. For linear regression and one-layer neural networks, we show that the Lebesgue measure of this set is small: it scales on the order of $ε$ and, when $ε$ is small enough, on the order of $ε^d$. More generally, under mild regularity assumptions, we prove that the forging set measure decays as $ε^{(d-r)/2}$, where $d$ is the data dimension and $r
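The $ε$-forging set from this abstract can be made concrete with a small numerical sketch. For linear regression with squared loss $\ell(w;(x,y)) = (wx-y)^2/2$, the gradient at a point is $(wx-y)x$; the sketch below (with hypothetical values for $w$ and the target point, not taken from the paper) estimates the Lebesgue measure of the set of points whose gradient lies within $ε$ of the target's, via Monte Carlo sampling over a box, and shows the measure shrinking with $ε$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 1-D linear regression, loss l(w;(x,y)) = (w*x - y)^2 / 2,
# so grad_w l = (w*x - y) * x.  Fix the model w and a target point (x*, y*).
w, x_star, y_star = 1.0, 0.5, 0.2
g_star = (w * x_star - y_star) * x_star   # target gradient

def forging_measure(eps, n=200_000, box=2.0):
    """Monte Carlo estimate of the Lebesgue measure of the eps-forging set
    {(x, y) : |grad(x, y) - g_star| <= eps} inside the box [-box, box]^2."""
    pts = rng.uniform(-box, box, size=(n, 2))
    g = (w * pts[:, 0] - pts[:, 1]) * pts[:, 0]
    frac = np.mean(np.abs(g - g_star) <= eps)
    return frac * (2.0 * box) ** 2   # hit fraction times box volume

m_big = forging_measure(0.5)    # loose tolerance: larger forging set
m_small = forging_measure(0.05)  # tight tolerance: much smaller set
```

Geometrically, the forging set here is a thin band around the curve $(wx-y)x = g^*$ in data space, so its measure scales on the order of $ε$, matching the linear-regression result stated in the abstract.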
Linear Independence of Generalized Neurons and Related Functions
The linear independence of neurons plays a significant role in theoretical analysis of neural networks. Specifically, given neurons $H_1, ..., H_n: \bR^N \times \bR^d \to \bR$, we are interested in the following question: when are $\{H_1(\theta_1, \cdot), ..., H_n(\theta_n, \cdot)\}$ linearly independent as the parameters $\theta_1, ..., \theta_n$ of these functions vary over $\bR^N$? Previous works give a complete characterization for two-layer neurons without bias, for generic smooth activation functions. In this paper, we study the problem for neurons with arbitrary layers and widths, giving a simple but complete characterization for generic analytic activation functions.
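The question posed above can be probed numerically (a sanity check, not a proof): for generic parameters, evaluating $n$ two-layer neurons at sufficiently many sample points should give a full-rank matrix, consistent with linear independence. The sketch below uses tanh neurons with random weights as an illustrative choice of generic analytic activation and parameters.

```python
import numpy as np

rng = np.random.default_rng(1)

# Numerical check (not a proof): generic two-layer tanh neurons
# x -> tanh(w_i . x + b_i), evaluated at enough sample points, give a
# full-rank evaluation matrix, consistent with linear independence.
d, n, m = 3, 5, 50                  # input dim, number of neurons, sample points
W = rng.normal(size=(n, d))         # generic (random) inner weights
b = rng.normal(size=n)              # generic (random) biases
X = rng.normal(size=(m, d))         # random sample inputs
M = np.tanh(X @ W.T + b)            # M[j, i] = H_i(theta_i, x_j)
rank = np.linalg.matrix_rank(M)     # full rank n <=> no nontrivial
                                    # linear relation on the samples
```

A rank deficiency here would exhibit a linear dependence; full rank on random samples is the generic behavior the characterization in the paper makes precise.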
Geometry of Critical Sets and Existence of Saddle Branches for Two-layer Neural Networks
Zhang, Leyang, Zhang, Yaoyu, Luo, Tao
This paper presents a comprehensive analysis of critical point sets in two-layer neural networks. To study such complex entities, we introduce the critical embedding operator and the critical reduction operator as our tools. Given a critical point, we use these operators to uncover the whole underlying critical set representing the same output function, which exhibits a hierarchical structure. Furthermore, we prove the existence of saddle branches for any critical set whose output function can be represented by a narrower network. Our results provide a solid foundation for the further study of optimization and training behavior of neural networks.
Cutting a Cake Is Not Always a 'Piece of Cake': A Closer Look at the Foundations of Cake-Cutting Through the Lens of Measure Theory
Kern, Peter, Neugebauer, Daniel, Rothe, Jörg, Schilling, René L., Stoyan, Dietrich, Weishaupt, Robin
Since the groundbreaking work of Steinhaus (1948), cake-cutting is a metaphor for the so-called fair division problem for a divisible, heterogeneous good, which addresses the problem of splitting a contested quantity (a 'cake') in a fair way among several parties A, B, C,...; each party may have its own idea about the value of the different parts of the cake. While mainly mathematicians and economists were concerned with the study of cake-cutting early on, "in recent years, cake cutting has emerged as a major research topic in artificial intelligence," as Balkanski et al. (2014, p. 567) note. They substantiate their claim by listing ten papers on cake-cutting, five of which appeared in AAAI (e.g., Cohler et al. (2011)), three in IJCAI (e.g., Procaccia (2009)), and the remaining two in AAMAS proceedings (e.g., Aumann et al. (2013)). For more than a decade now, AAAI and IJCAI (the two top AI conferences) and AAMAS (the leading venue for research on multiagent systems) have published numerous research papers on fair division and, in particular, on cake-cutting. Balkanski et al. (2014, p. 567) go on to write, "The growing interest in cake cutting, and fair division more broadly, is partly motivated by potential applications in AI, such as industrial procurement, manufacturing and scheduling, and airport traffic management (Chevaleyre et al., 2006)."
The Expressive Power of a Class of Normalizing Flow Models
Kong, Zhifeng, Chaudhuri, Kamalika
Normalizing flows have received a great deal of recent attention as they allow flexible generative modeling as well as easy likelihood computation. While a wide variety of flow models have been proposed, there is little formal understanding of the representation power of these models. In this work, we study some basic normalizing flows and rigorously establish bounds on their expressive power. Our results indicate that while these flows are highly expressive in one dimension, in higher dimensions their representation power may be limited, especially when the flows have moderate depth.
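The one-dimensional expressivity alluded to above rests on the change-of-variables formula that underlies all normalizing flows: for an increasing map $x = f(z)$, the pushed-forward density is $p_X(x) = p_Z(f^{-1}(x)) / f'(f^{-1}(x))$. The sketch below uses a hypothetical flow $f(z) = z + z^3$ (not a model from the paper) pushing a standard normal forward, and verifies numerically that the resulting density integrates to one.

```python
import numpy as np

# 1-D flow sketch: push z ~ N(0,1) through the increasing map
# f(z) = z + z^3 and evaluate the model density via change of variables:
#   p_X(x) = p_Z(f^{-1}(x)) / f'(f^{-1}(x)).
def f(z):
    return z + z ** 3

def f_prime(z):
    return 1.0 + 3.0 * z ** 2

z_grid = np.linspace(-6.0, 6.0, 400_001)
x_grid = np.linspace(f(-6.0), f(6.0), 200_001)
z_of_x = np.interp(x_grid, f(z_grid), z_grid)        # numerical inverse (f is monotone)
p_z = np.exp(-z_of_x ** 2 / 2) / np.sqrt(2 * np.pi)  # base density at f^{-1}(x)
p_x = p_z / f_prime(z_of_x)                          # change-of-variables density
# Trapezoidal rule: the pushed-forward density should integrate to ~1.
total = float(np.sum(0.5 * (p_x[1:] + p_x[:-1]) * np.diff(x_grid)))
```

In one dimension any target density can be reached by choosing a suitable monotone $f$; the paper's point is that this flexibility does not carry over directly to moderate-depth flows in higher dimensions.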